1.  SUMMARY


In the spring of 2000, Atmospheric and Environmental Research, Inc. (AER) was
awarded a contract by the Air Force Research Laboratory for a proposal submitted under
the  Broad  Agency  Announcement  to  conduct  research  and  development  work.  AER’s 
contribution  to  this  effort  would  be  performed  as  a  team  member  in  a  Department  of 
Defense  (DoD)  High  Performance  Computing  and  Modernization  Office  (HPCMO) 
project  aimed  at  improving  the  quality  and  usefulness  of  weather  forecast  data  in  support 
of the USAF mission. The armed forces currently devote a large amount of resources to
the  timely  collection  and  dissemination  of  weather  information  to  minimize  negative 
weather  impacts  on  the  warfighter  and  to  use  weather  knowledge  as  a  force  multiplier.  A 
core activity at military weather centers involves the use of analysis and forecast models.
The  quality  of  the  products  produced  by  these  analysis  and  forecast  models  is  highly 
dependent  upon  the  conventional  observing  network  (e.g.,  radiosondes  and  surface 
observing  stations).  However,  this  network  is  not  capable  of  sampling  the  atmosphere 
with  the  needed  temporal  and  spatial  resolution  to  accurately  resolve  the  theater-scale 
weather patterns of interest. While remotely measured weather data have the potential to
overcome this deficiency, their use in theater-scale models is problematic. Among the
difficulties is the fact that the remotely measured quantity must first be converted into one
of the model’s native dependent variables, often with a loss of accuracy. The four-dimensional
variational (4d-Var) approach can be used to overcome this difficulty since it
can  directly  assimilate  any  measured  quantity.  However,  the  computational  demands  of 
this  approach  far  exceed  the  compute  cycles  available  at  the  typical  weather  center. 
During  this  project,  we  developed  a  highly  scalable  version  of  a  4d-Var  application  that 
has the potential to execute within the time constraints of an operational center. Its
contribution  to  the  objectives  of  dominant  battlespace  awareness  and  information 
superiority  espoused  in  Joint  Vision  2010  can  be  considerable.  It  seeks  to  more  fully 
exploit  vital  space-based  environmental  monitoring  assets  to  improve  situational 
awareness, mission planning, and weapon system execution. The application developed
under this project is a state-of-the-art meteorological analysis code that
provides accurate depictions of the state of the atmosphere. It has been applied successfully
to a large number of cases, and the results have been documented in peer-reviewed
forums. Before this project, this code was optimized for vector computer architectures.
By reengineering this code to make it “scale” (i.e., performance increases linearly) as
additional CPUs are devoted to the calculations, significant speedups were achieved on
the class of computers referred to as Massively Parallel Processors (MPP). The 4d-Var
technique performs a series of iterations that require computational power measured in
the tens of gigaflops for real-time application. Proven strategies and techniques were
employed to develop the scalable version of this code. Among the coding strategies
employed was domain decomposition. The project reached all of its critical test goals,
which  included  tests  for  scalability,  portability,  and  correctness.  Test  results  were 
confirmed by NCAR. For scalability, the wall clock time of the scalable MM5v3 4d-Var
was reduced by factors of up to 36 (speedup was case dependent) from that of the
baseline on 64 nodes of an IBM SP P3 system, and by factors of up to 196 on 64
nodes of a Compaq ES-45 system. Correctness was measured by comparison with the
MM5v1 4d-Var code on a single processor, and the results were found to differ by less than 3.5
percent. Correctness was also measured in terms of differences between multiprocessor
and single processor runs, where differences were less than 1 percent. A tested, scalable
4d-Var  code  was  delivered,  along  with  periodic  progress  reports,  and  full  user 
documentation. During the latter half of this contract, we investigated the potential impact
of  optical  turbulence  data  on  upper-air  data  analysis.  The  task  involved  developing 
additional  software  for  the  MM5  4d-Var  code  and  applying  it  to  data  obtained  by 
AFRL. The preliminary results from that exercise suggest that these data have the potential
to improve upper-atmospheric analyses and the NWP model forecasts made from them.

2.  INTRODUCTION 

This section provides some background and introductory material. The subsequent
sections present a summary of the activities conducted for this project. Section 3
presents the methods and procedures used to develop, test, and evaluate the scalable
MM5v3 4d-Var code. Section 3.11 covers the work on the optical turbulence task.
Section 4 contains test results and discussion. Concluding remarks are presented in
Section 5.

2.1  DoD  High  Performance  Computing  Modernization  Office 

The  Office  of  the  Secretary  of  Defense  is  investing  significant  resources  in  high 
performance  computing  to  provide  the  United  States  military  with  a  technological 
advantage  to  support  warfighting  requirements.  The  High  Performance  Computing 
Modernization  Program  (HPCMP;  http://www.hpcmo.hpc.mil)  provides  advanced 
hardware,  computing  tools  and  training  to  DoD  researchers  utilizing  the  latest  technology 
to  aid  their  mission  in  support  of  the  warfighter.  The  program  has  three  initiatives: 

(1)  High  performance  computing  centers  which  consist  of  major  shared 
resource  centers  (MSRCs)  and  distributed  centers  (DCs), 

(2)  The  Defense  Research  and  Engineering  Network  (DREN), 

(3)  The  Common  High  Performance  Computing  Software  Support  Initiative 
(CHSSI). 

CHSSI is described in more detail below.

2.1.1  Common  High  Performance  Computing  Software  Support  Initiative 

CHSSI  is  an  application  software  development  component  of  the  HPCMP  that 
provides DoD research scientists and engineers with technical codes that exploit scalable
computing  systems.  The  CHSSI  applications  are  selected  based  on  their  critical  need. 
These  products  facilitate  a  large  fraction  of  the  DoD  science  and  technology  and 
developmental  test  and  evaluation  computational  workload  in  support  of  DoD 
warfighting  requirements. 

In  January  1999,  the  Air  Force  Research  Laboratory  teamed  with  Atmospheric  and 
Environmental  Research,  Inc.  (AER),  Florida  State  University  (FSU),  and  the  National 
Center  for  Atmospheric  Research  (NCAR)  and  submitted  a  proposal  to  create  a  version  of 
the MM5 4d-Var analysis application (see below) that scales on a class of computers
known  as  Massively  Parallel  Processors  (MPP).  This  software  would  enable  its  users  to 
achieve  runtime  efficiencies  that  would  make  possible  operational  implementation  of  the 
4d-Var  technique.  In  November  1999,  the  HPCMP  Office  informed  the  team  that  the 
proposal  was  accepted,  and  work  began  in  February  2000.  The  project  was  the  fifth 
selected  under  the  Climate/Weather/Ocean  Computational  Technical  Area  (CTA)  of 
CHSSI,  and  is  identified  as  CWO-5. 

2.2  DoD  Weather 

DoD  operates  a  military  environmental  service  system  to  provide  specialized 
worldwide  meteorological,  space  environmental,  and  oceanographic  analysis  and 
prediction  services  in  support  of  military  forces.  This  system  directly  supports  all  phases 
of military operations, from strategic planning to tactical operations. While the Army and
Marine  Corps  each  have  a  small,  specialized  weather  support  capability,  the  Naval 
Meteorology  and  Oceanography  Command  and  Air  Force  Weather  are  the  primary 
sources  of  military  weather  products.  The  military  weather  services  contribute  to  the 
national  and  international  weather  observing  capability  by  taking  conventional 
observations  on  land  and  at  sea  where  there  are  no  other  conventional  weather  observing 
capabilities  and  where  the  observations  are  most  needed  to  meet  military  requirements.  In 
addition,  DoD  maintains  specialized  observing  capabilities,  such  as  the  Defense 
Meteorological  Satellite  and  Global  Weather  Intercept  Programs,  to  meet  unique  military 
requirements.  Observational  data  are  sent  by  military  communications  networks  to 
military  and  civil  facilities  in  the  United  States  and  overseas. 

2.2.1  Air  Force  Weather  Agency 

The  Air  Force  Weather  Agency  (AFWA)  is  a  field  operating  agency  that  reports  to 
HQ  USAF/XOW,  the  Deputy  Chief  of  Staff  for  Air  and  Space  Operations.  AFWA 
provides strategic-level weather support (global and synoptic scale) for its worldwide
customers,  and  fulfills  other  unique  mission  requirements.  AFWA  is  the  primary 
production  center  for  providing  weather  analyses  and  forecasts  for  Air  Force  and  Army 
operations.  Worldwide  weather  data  are  relayed  to  AFWA  and  blended  with  civil  and 
military meteorological satellite data to construct a real-time, integrated environmental
database.  Computer  programs  digest  the  data  and  process  it  with  models  of  the 
atmosphere  to  forecast  its  future  behavior. 

Our primary intended customer for this project is the AFWA. During the course of
this project, we considered the present AFWA operating environment and likely future
capabilities and factored them into the software development process. An example of this
is the choice of a computer platform on which to run the 4d-Var code; we selected
an IBM SP-class computer since this is the type of system presently in use at AFWA.
We also assumed that AFWA’s computational facilities would keep pace with trends in
the computing world, where CPU processing speeds are doubling approximately every 18
months. This would enable AFWA to implement 4d-Var, or some other variational
method, into operations in the 2005 time frame.
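
The doubling assumption above can be turned into a rough growth factor. The sketch below is only an arithmetic illustration: the 60-month interval (project start in 2000 to the 2005 time frame) and the function name are assumptions made here, not figures taken from AFWA planning documents.

```python
def projected_speed_factor(months_elapsed, doubling_period_months=18.0):
    """Growth factor in CPU speed after months_elapsed, given a fixed doubling period."""
    return 2.0 ** (months_elapsed / doubling_period_months)


# Assumption for illustration: 60 months between 2000 and 2005.
factor = projected_speed_factor(60)
print(f"Projected CPU speed factor over 60 months: {factor:.1f}x")
```

Under these assumptions the hardware trend alone would supply roughly a tenfold gain, which is why the remaining gain had to come from scalable software.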

2.3  AFRL CHSSI Project

This  section  provides  information  on  the  CHSSI  CWO-5  team,  the  Statement  of 
Work  defining  AER’s  role  in  the  CWO-5  project,  the  migration  of  the  4d-Var  code  to 
version  3  of  the  MM5,  some  relevant  distributed  memory  (DM)  issues,  and  the  funding 
mechanism. 

2.3.1  CHSSI  Project  Team 

The  project  team  for  the  CHSSI  CWO-5  project  is  shown  in  Table  1. 


Table 1.  CHSSI CWO-5 project team and basic responsibilities

Name                          Responsibilities
----------------------------  --------------------------------------------------------
Dr. Frank Ruggiero            Project Leader, overall project management, interface to
AFRL/VSBYA                    HPCMO, coordination of efforts of various project members,
                              writing scalable code, and post grant software maintenance

Dr. Thomas Nehrkorn           Meteorologist/Programmer, writing, testing scalable code;
Atmospheric and               support reviews, code releases; maintain software
Environmental Research        repository

Mr. George Modica             Meteorologist/Programmer, writing, testing scalable code;
Atmospheric and               support reviews, code releases, documentation; maintain
Environmental Research        document repository

Mr. Ernesto Sendoya           Software Engineer, software design, documentation, and
Atmospheric and               review; monitor adherence to software engineering
Environmental Research        standards

Dr. Xiaolei Zou               4d-Var Expert, responsible for conducting baseline test of
Florida State University      the original 4d-Var software; development of incremental
                              driver, bogus-data vortex assimilation codes

Mr. John Michalakes           Software Engineer/Programmer, manage code parallelization
National Center for           strategy and implementation, test and evaluation
Atmospheric Research

The  management  structure  of  this  project  is  illustrated  in  Figure  1.  AFRL  served  as 
the overall program manager and interface to the HPCMO, with Florida State University (X.
Zou)  and  NCAR  (J.  Michalakes)  providing  technical  management  of  the  4d-Var  code. 
AER  was  responsible  for  software  engineering  activities,  including  code  development, 
testing,  and  documentation.  AER  also  supported  reviews  and  provided  software  and 
document  releases.  While  all  three  parties  participated  in  different  aspects  of  software 
development,  AER  played  the  coordinating  role. 

Figure 1.  Illustration of the Management Structure Employed for CWO-5.

2.3.2  Variational  Method 

The variational method solves a minimization problem, in which a model initial
state $x(t_0)$ is found which minimizes an objective (cost) function $J$. In meteorological
applications, the cost function often takes a form similar to the following (see also, for
example, Ide et al. 1997):

$$J[x(t_0)] = \frac{1}{2}\,[x(t_0)-x^b(t_0)]^T B_0^{-1}\,[x(t_0)-x^b(t_0)] + \frac{1}{2}\sum_{i=0}^{n} (y_i-y_i^o)^T R_i^{-1}\,(y_i-y_i^o) \qquad (1)$$

where $x^b$ is the a priori background field with assumed error covariance $B_0$, $y_i^o$
denotes the vector of observations at time $t_i$, and $R_i$ is the corresponding observation
error covariance matrix. The simulated observations $y_i$ are obtained by applying the [in
general, nonlinear] observation operator $H_i$ to the model predicted variables:
$y_i = H_i[x(t_i)]$. The minimization is performed over an analysis time window $[t_0, t_n]$.

The 3d-Var algorithm can formally be written in the same way, except that the sum over $i$
in the second RHS term in (1) is replaced by a single term; that is, only the observations
at time $t_0$ are considered in the minimization. In practice, observations within a data
cutoff window around $t_0$ are grouped together for the nominal analysis time, and the
background field is interpolated in time and space to the observation location and time.

Minimization of the variational analysis problem requires an estimate of the
gradient of $J$ with respect to the solution vector $x(t_0)$, which is most efficiently
computed with the adjoint of the observation operator (and, for 4d-Var, the adjoint of the
forecast model). The MM5 4d-Var system is described in Section 2.3.3.
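
As an illustration of how the cost function and its adjoint-based gradient fit together, the following minimal sketch sets up a toy problem with a two-variable linear "forecast model", a diagonal background error covariance, and an observation operator that reads the first state component. Everything in it (the matrix `M`, the weights `B_INV` and `R_INV`, and all function names) is a hypothetical stand-in chosen for brevity; it is not the MM5 4d-Var code. The gradient is obtained from a single backward sweep with the transposed model and is checked against finite differences.

```python
# Toy 4d-Var cost function and adjoint-based gradient (illustrative only).
M = [[1.0, 0.1], [-0.1, 1.0]]   # hypothetical linear forecast model, one time step
B_INV = [1.0, 1.0]              # diagonal of the inverse background covariance (assumed)
R_INV = 4.0                     # inverse observation error variance (assumed)


def step(x):
    """Advance the state one time step: x_{i+1} = M x_i."""
    return [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]


def step_adjoint(a):
    """Apply the adjoint (transpose) of the model step: M^T a."""
    return [M[0][0] * a[0] + M[1][0] * a[1], M[0][1] * a[0] + M[1][1] * a[1]]


def trajectory(x0, n_steps):
    """Return the list of states x_0 ... x_n."""
    traj = [list(x0)]
    for _ in range(n_steps):
        traj.append(step(traj[-1]))
    return traj


def cost(x0, xb, obs):
    """Background term plus observation term; H observes the first component."""
    jb = 0.5 * sum(B_INV[k] * (x0[k] - xb[k]) ** 2 for k in range(2))
    traj = trajectory(x0, len(obs) - 1)
    jo = 0.5 * sum(R_INV * (x[0] - y) ** 2 for x, y in zip(traj, obs))
    return jb + jo


def gradient(x0, xb, obs):
    """Gradient of the cost with respect to x0 from one backward (adjoint) sweep."""
    traj = trajectory(x0, len(obs) - 1)
    adj = [0.0, 0.0]
    for i in range(len(obs) - 1, -1, -1):
        adj[0] += R_INV * (traj[i][0] - obs[i])   # forcing: H^T R^{-1} (y_i - y_i^o)
        if i > 0:
            adj = step_adjoint(adj)               # carry sensitivity back one step
    return [B_INV[k] * (x0[k] - xb[k]) + adj[k] for k in range(2)]


def fd_gradient(x0, xb, obs, eps=1e-6):
    """Central finite-difference gradient, used only as a check."""
    grad = []
    for k in range(2):
        xp, xm = list(x0), list(x0)
        xp[k] += eps
        xm[k] -= eps
        grad.append((cost(xp, xb, obs) - cost(xm, xb, obs)) / (2 * eps))
    return grad


truth = [1.2, -0.3]
xb = [1.0, 0.0]
obs = [x[0] for x in trajectory(truth, 5)]   # synthetic, error-free observations
print(gradient(xb, xb, obs), fd_gradient(xb, xb, obs))
```

The backward sweep costs about as much as one forward integration regardless of the size of the state vector, whereas the finite-difference check needs two integrations per state component; that asymmetry is the reason the adjoint is preferred.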

2.3.3  MM5  4d-Var  System 

During  a  three-year  period  ending  in  1997,  the  Microscale  and  Mesoscale 
Meteorology  Group  at  the  National  Center  for  Atmospheric  Research  (NCAR)— under 
the  support  of  the  National  Science  Foundation,  the  Federal  Aviation  Administration,  and 
the  Department  of  Energy — developed  a  mesoscale  data  assimilation  system  based  on  the 
nonhydrostatic  version  of  MM5  and  its  adjoint  (Zou  et  al.  1998).  The  initial  version  of  the 
MM5  4d-Var  system  was  coded  for  single  processor  computer  architectures  and  its 
nonlinear, tangent-linear, and adjoint components were based on Version 1 of the MM5
(MM5v1). The MM5v1 4d-Var system included a bulk aerodynamic formulation of the
planetary boundary layer, a dry convective adjustment, grid-resolvable large-scale
precipitation, and a Kuo-type cumulus parameterization in addition to the model
dynamics. The MM5v1 4d-Var has since been updated to include the Grell cumulus
parameterization, Dudhia’s explicit moisture scheme, and a radiative upper boundary
condition. The latest version also has the capability to use 8-byte MM5v2 input files as
well as standard MM5v3 input files and to resume minimization from a restore file. The
MM5v1 4d-Var can operate on multiple platforms, such as the DEC (Compaq) Alpha,
SGI, and PC-Linux.

The background error covariance $B_0$ in (1) is used to weight errors in the features of
the background field relative to the observations. The assimilation system will assign less
weight to those structures with large error relative to more accurately known background
features and observations. Evaluation of the background term in (1) requires the inverse of the background error
covariance  matrix.  With  the  length  of  the  model  state  vector  being  on  the  order  of  10  , 
direct  evaluation  and  storage  of  this  matrix  is  computationally  prohibitive.  There  are  a 
number  of  different  procedures  to  make  the  background  weighting  problem  more 
tractable. The approach taken in the MM5v1 4d-Var system is to assume $B_0$ is
approximately diagonal in full-field space. A more elegant approach is taken, e.g., in the
WRF 3d-Var data assimilation system through the introduction of a variable transformation,
$\delta w = Uv$, so that the background error covariance matrix is approximately diagonal in $v$
space. With proper normalization of $U$, the background error covariance matrix can then
be approximated by the identity matrix. This method has been used successfully in
NCEP’s spectral statistical interpolation (SSI) scheme (where the spectral modes are
assumed to have uncorrelated error); similar approaches are in use at the ECMWF and in
the mesoscale 3d-Var system developed at NCAR.
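
The effect of the variable transformation can be checked numerically. The sketch below assumes a 2 x 2 case in which the background error covariance factors exactly as B = U U^T; the matrix `U`, the control vector `v`, and the helper names are hypothetical values chosen for illustration. With the increment written as U v, the background term computed with the explicit inverse of B equals one half of v·v, so no inverse is needed in control-variable space.

```python
def matvec(a, x):
    """Matrix-vector product for small dense matrices stored as nested lists."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def outer_factor(u):
    """Return U U^T."""
    n = len(u)
    return [[sum(u[i][k] * u[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(a):
    """Explicit inverse of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


U = [[2.0, 0.0], [0.5, 1.0]]    # hypothetical transform with B = U U^T
B = outer_factor(U)             # full-space background error covariance
v = [0.3, -0.7]                 # control vector
dx = matvec(U, v)               # increment in model space

jb_full_space = 0.5 * dot(dx, matvec(inverse_2x2(B), dx))   # needs the inverse of B
jb_control_space = 0.5 * dot(v, v)                          # identity covariance in v space
print(jb_full_space, jb_control_space)
```

For a realistic state vector the inverse in the first expression cannot be formed, which is the practical motivation for minimizing with respect to v instead.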

In general, the error structures represented by the background error covariance
matrix are flow dependent and change from day to day depending on the synoptic regime
and other factors. However, providing background “errors of the day” can be a costly
endeavor, and is not easily implemented in practice. One way of approximating the
background error covariance data is with the “NMC-method” (Parrish and Derber 1992).
In this method the background errors are approximated from averaged forecast
differences, e.g.,

$$B = \overline{(x^b - x^t)(x^b - x^t)^T} = \overline{\epsilon_b \epsilon_b^T} \approx \overline{(x^{T+48} - x^{T+24})(x^{T+48} - x^{T+24})^T} \qquad (2)$$

where $x^b$ is the background field, $x^t$ is the true atmospheric state, $\epsilon_b$ is the
background error, and $x^{T+48}$ and $x^{T+24}$ are 48-h and 24-h forecasts valid at the
same time. The overbar in (2) represents an average in time and/or space. The
WRF  3d-Var  includes  tunable  background  error  files  that  were  computed  for  a  variety  of 
horizontal spatial resolutions and for the major seasons of the year. The MM5v1 4d-Var
application  only  has  the  option  to  use  “direct  observations,”  which  consists  of  gridded 
analysis  fields. 
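
The forecast-difference average used by the NMC method can be sketched as follows. The example assumes pairs of longer- and shorter-range forecasts valid at the same time; the two-element state vectors are made-up numbers for illustration only, and `nmc_covariance` is a hypothetical helper, not a routine from the WRF 3d-Var or MM5 4d-Var packages.

```python
def nmc_covariance(forecast_pairs):
    """Average outer product of forecast differences (longer minus shorter range)."""
    diffs = [[a - b for a, b in zip(longer, shorter)]
             for longer, shorter in forecast_pairs]
    n_state = len(diffs[0])
    n_samples = len(diffs)
    return [[sum(d[i] * d[j] for d in diffs) / n_samples for j in range(n_state)]
            for i in range(n_state)]


# Made-up (longer-range, shorter-range) forecast pairs valid at the same times.
pairs = [
    ([1.0, 2.0], [0.5, 2.5]),
    ([2.0, 1.0], [3.0, 0.0]),
    ([0.0, 0.0], [-0.5, -0.5]),
    ([1.0, 1.0], [1.0, 2.0]),
]
B_est = nmc_covariance(pairs)
for row in B_est:
    print(row)
```

In practice the average would run over many weeks of forecasts, and the resulting statistics are usually reduced further (for example to variances and correlation length scales) rather than stored as a full matrix.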

2.3.4  Statement  of  Work 

The  technical  implementation  plan,  or  statement  of  work  (SOW)  for  this  contract, 
concentrated on the development, validation, and demonstration of a scalable system for
the  4d-Var  data  analysis  of  satellite  radiance  data.  An  overarching  goal  for  CWO-5  was 
to  develop  an  analysis  system  that  achieves  speed-up  in  wall-clock  time  sufficient  to 
make  feasible  its  use  in  an  operational  weather  analysis  and  forecast  center.  To  meet  this 
operational  requirement,  we  set  an  objective  for  the  baseline  system  to  achieve  a  wall- 
clock  time  speedup  of  32  to  64  times  that  of  the  current  version.  While  the  4d-Var  system 
was  designed  to  run  on  as  many  platforms  as  possible,  development  was  focused  on  those 
platforms  commonly  available  to  potential  users  of  the  system  such  as  AFWA  and  others 
in  the  operational  and  research  communities,  both  inside  and  outside  DoD.  The  analysis 
system, which uses an NWP model to constrain the analyzed state, includes physics
parameterizations  that  are  consistent  with  current  mesoscale  NWP  models. 
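
The wall-clock speedup objective can be expressed as a simple check. In the sketch below the two timings are hypothetical values chosen so that the ratio reproduces the factor of 36 quoted in the Summary for 64 nodes; the function names and the default threshold of 32 (the lower end of the stated objective) are assumptions for illustration.

```python
def speedup(baseline_seconds, parallel_seconds):
    """Ratio of baseline wall-clock time to parallel wall-clock time."""
    return baseline_seconds / parallel_seconds


def parallel_efficiency(speedup_factor, n_nodes):
    """Speedup per node; 1.0 corresponds to ideal linear scaling."""
    return speedup_factor / n_nodes


def meets_objective(speedup_factor, minimum=32.0):
    """True if the speedup reaches the lower end of the 32x to 64x objective."""
    return speedup_factor >= minimum


baseline_s = 7200.0   # hypothetical baseline wall-clock time
parallel_s = 200.0    # hypothetical wall-clock time on 64 nodes
s = speedup(baseline_s, parallel_s)
eff = parallel_efficiency(s, 64)
print(f"speedup {s:.0f}x, efficiency {eff:.2f}, objective met: {meets_objective(s)}")
```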

The MM5v1 version of the 4d-Var system runs on a vector-class machine using the
non-hydrostatic form of the governing equations and simple physics parameterizations.
The programming strategy followed an incremental series of four builds. The first build
provided the candidate baseline code for the Software Acceptance Test (SAT). This build
comprised, in essence, the NCAR MM5v1 4d-Var code (see 2.3.3) ported to
run on a single node of an MPP machine. The second build, the Alpha Test Code, included
scalable non-linear and tangent-linear models (NLM and TLM, respectively). The Beta
Test Code, or third build, completed the development of the scalable 4d-Var system by
including a scalable adjoint model (ADJ). The Beta build included radiosonde,
surface, and satellite observation operators that permit the use of those
observations. The final build was the Initial Operating Capability (IOC) build. The IOC
version implements the incremental driver and bogus data vortex assimilation options.
All problems encountered during the extensive beta testing were corrected and the fixes
incorporated into the IOC build, and additional physics upgrades were added as options to the
system at that time. While the IOC code was not subject to review, it was prepared with
the same software management methodology as the other releases.

The  Work  Breakdown  Structure  (WBS)  provides  the  backbone  of  the  development 
path  for  the  CWO-5  code,  and  is  included  in  the  Software  Development  Plan  described 
later in Section 3.2. As illustrated in Figure 1, each team member had a defined role in
overall 4d-Var code development. The portion of the WBS that came under AER’s
responsibility is included in the original proposal for this project. That SOW is excerpted
in Sections 2.3.4.1 to 2.3.4.5 below.

2.3.4.1  Alpha  Test  Code  Build 

During  this  phase,  the  CHSSI  team  was  charged  with  further  development  and 
improvement  of  the  baseline  code  to  the  levels  required  for  Alpha  release.  AER 
commenced  efforts  to  develop  scalable  forward  model  components  of  the  4d-Var  system. 
The  components  are  spelled  out  in  the  Software  Development  Plan  and  are  summarized 
below: 

•  Non-linear  Forecast  Model:  Download  and  run  the  most  recently  available 
version  of  the  MM5  non-linear  model  (NLM)  in  order  to  familiarize  the 
CWO-5  programming  team  with  the  constructs  used  there  to  implement 
scalability.  This  is  the  same-source  paradigm  used  in  the  MM5  NLM  to 
enable  parallelization  of  the  executable  code  while  still  having  one  source 
code  for  both  single  and  multiprocessor  platforms  (described  in  Section 
2.3.6).  Ensure  agreement  between  output  of  serial  code  and  parallel  codes 
running  both  on  a  single  and  on  multiple  processors.  Incorporate  into  the 
CWO-5  4d-Var  system  updated  versions  of  the  NLM  code,  as  these  updates 
become  available. 

•  Tangent-linear  Forecast  Model:  Develop  a  new  TLM  based  on  the  release 
of  MM5  used  as  the  NLM.  Make  use  of  tangent-linear  and  adjoint 
compilers  in  this  effort  (described  later  in  Section  2.3.5).  Make  the  tangent- 
linear  model  scalable  in  a  manner  similar  to  that  used  to  build  the  same- 
source  parallel  NLM:  Incorporate  same-source  parallel  modifications  to  the 
executable build structure by adding the FORTRAN Loop and Index
Converter (FLIC; http://www-unix.mcs.anl.gov/~michalak/flic/) and
Runtime System Library (RSL; http://www-unix.mcs.anl.gov/~michalak/rsl/)
libraries. Perform data decomposition of
static data structures. Analyze data dependencies in subroutines and
implement inter-processor communication. Adapt I/O, model initialization,
and namelist configuration to parallel architecture. Obtain agreement
between output of non-parallel code and both parallel code running on a
single processor and parallel code running on multiple processors through
substantial testing. Conduct performance optimization.

•  Documentation:  Produce  initial  version  of  user’s  guide  for  alpha  testers. 
Documentation  is  to  include  instructions  on  installation,  configuration, 
execution,  and  test  cases. 
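
One common way to check a tangent-linear model against its nonlinear parent is a perturbation-ratio test; the source does not state which tests were applied, so the sketch below is only an illustration of the idea. The two-variable model `nlm`, its hand-derived linearization `tlm`, and the chosen state and perturbation are hypothetical. The ratio of the nonlinear difference to the scaled TLM output should approach 1 as the perturbation size alpha shrinks.

```python
import math


def nlm(x):
    """Toy nonlinear model step (hypothetical stand-in for the forecast model)."""
    return [x[0] + 0.1 * x[0] * x[1], x[1] - 0.1 * x[0] ** 2]


def tlm(x, dx):
    """Tangent-linear model: derivative of nlm at x applied to the perturbation dx."""
    return [dx[0] + 0.1 * (dx[0] * x[1] + x[0] * dx[1]),
            dx[1] - 0.2 * x[0] * dx[0]]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def tlm_ratio(x, dx, alpha):
    """||nlm(x + alpha*dx) - nlm(x)|| / ||alpha * tlm(x, dx)||."""
    x_pert = [a + alpha * b for a, b in zip(x, dx)]
    diff = [p - q for p, q in zip(nlm(x_pert), nlm(x))]
    return norm(diff) / (alpha * norm(tlm(x, dx)))


x = [1.0, 2.0]
dx = [0.5, -0.3]
for alpha in (1e-1, 1e-2, 1e-3, 1e-4):
    print(alpha, tlm_ratio(x, dx, alpha))
```

A ratio that stalls away from 1, or that fails to improve as alpha decreases, usually points to a coding error in the linearization.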

The  primary  AER  activity  during  this  development  phase  consisted  of  preparing  a 
TLM  code  based  on  MM5v3  of  the  NLM.  The  TLM  modules  were  then  subject  to  a 
series of correctness tests. After the TLM passed the correctness tests, the codes were
handed  over  to  our  NCAR  partner  for  coding  changes  that  would  permit  scaling  on  MPP 
systems.  These  coding  changes  implement  domain  decomposition.  To  accomplish  this 
our  NCAR  partner  utilized  the  RSL,  which  makes  calls  to  the  Message  Passing  Interface 
(MPI; http://www-unix.mcs.anl.gov/mpi/) standard and the FLIC. These coding changes
were  applied  to  both  the  NLM  and  the  TLM  codes.  At  this  time,  the  CWO-5  team 
selected  a  subset  of  users  to  serve  as  formal  Alpha  testers  and  established  procedures  for 
monitoring  Alpha  release  and  user  feedback.  When  (1)  the  code  consistently  satisfied  the 
performance  criteria  required  by  the  TEMP  Addendum,  (2)  procedures  were  in  place  to 
adequately  track  code  releases  and  gather  user  feedback,  and  (3)  the  code  was  running 
well  on  at  least  one  DoD  HPC  platform,  we  scheduled  the  Alpha  Test  Review  with  the 
HPCMO  program  office.  The  project  team  provided  the  CHSSI  Project  Manager  with  the 
necessary  documentation  and  a  completed  testing  checklist  at  the  Alpha  test  review. 
During  this  period,  AER  served  as  the  Software  Configuration  Management  (SCM) 
Manager  and  had  the  responsibility  for  ensuring  the  proper  entries  were  made  in  the 
Software  Development  Library  (SDL).  This  includes  generation  of  documentation,  test 
reports,  software,  and  any  other  pertinent  materials. 
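
The idea behind domain decomposition, and the agreement check between serial and decomposed runs, can be illustrated without any message-passing library. The sketch below emulates the decomposition inside a single Python process: each "processor" owns a contiguous block of a one-dimensional grid, receives one halo point from each neighbor, and updates only the points it owns. It does not use RSL, FLIC, or MPI, and all names in it are hypothetical.

```python
def serial_step(u):
    """Three-point smoother applied to all interior points."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.25 * u[i - 1] + 0.5 * u[i] + 0.25 * u[i + 1]
    return new


def decompose(n_points, n_procs):
    """Split indices 0..n_points-1 into contiguous (start, stop) blocks."""
    base, extra = divmod(n_points, n_procs)
    bounds, start = [], 0
    for p in range(n_procs):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def decomposed_step(u, n_procs):
    """Same smoother, computed block by block with one-point halos."""
    n = len(u)
    bounds = decompose(n, n_procs)
    # Emulated halo exchange: each block gets a local copy of its owned
    # points plus one neighboring point on each side, where one exists.
    local_copies = []
    for start, stop in bounds:
        lo, hi = max(start - 1, 0), min(stop + 1, n)
        local_copies.append((lo, u[lo:hi]))
    new = list(u)
    for (start, stop), (lo, local) in zip(bounds, local_copies):
        # Owner computes: update only owned interior points, using local data.
        for i in range(max(start, 1), min(stop, n - 1)):
            j = i - lo
            new[i] = 0.25 * local[j - 1] + 0.5 * local[j] + 0.25 * local[j + 1]
    return new


field = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0, 81.0]
print(serial_step(field) == decomposed_step(field, 3))
```

Because each owned point is computed with the same operands in the same order as in the serial loop, the two results here agree exactly; in a real distributed run the halo copies would be replaced by inter-processor messages.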

2.3.4.2  Beta  Test  Code  Build 

After the Alpha Review, the team examined the Alpha code and addressed the
functionality and usability issues, identifying bugs, inconsistencies, confusing points, etc.
These changes were recorded in the history log maintained by the Concurrent Versions
System (CVS; http://www.cvshome.org/) for each of the 4d-Var modules. At the same
time,  the  development  team  began  planning  for  new  Beta  code  development  activities  by 
integrating  new  functionality,  fixes,  as  well  as  changes  identified  during  the  Alpha  test 
period.  The  most  notable  change  was  the  implementation  of  the  parallel  ADJ  code  by  our 
NCAR  team  member.  During  this  phase,  AER  prepared  software  documentation.  When 
the  code  achieved  "full"  functionality  on  two  or  more  DoD  HPC  platforms  and  was  in  a 
usable  state  by  the  CTA  community  (Computational  Technology  Area;  in  our  case 
Climate/Weather/Oceanography),  the  team  scheduled  the  Beta  test  review.  The 
components  are  spelled  out  in  the  Software  Development  Plan  and  are  summarized 
below: 

•  Non-linear  Forecast  Model:  Incorporate  bug  fixes  encountered  in  the 
Alpha  Test  Code; 

•  Tangent-linear  Forecast  Model:  Incorporate  bug  fixes  encountered  in  the 
Alpha  Test  Code; 

•  Adjoint Model: Develop a new MM5 adjoint model based upon the latest
release  of  MM5  used  as  the  NLM.  Incorporate  same-source  parallel 
modifications  to  the  executable  build  structure  by  adding  the  FLIC  and  RSL 
libraries.  Perform  data  decomposition  of  static  data  structures.  Modify 
loops  and  index  arithmetic  to  re-establish  “Owner  Computes”  and  to 
eliminate  instances  of  false  recursion  introduced  with  the  creation  of  the 
adjoint  model.  Make  modifications  to  FLIC  to  automate  code  changes  in  the 
previous  step.  Analyze  data  dependencies  in  subroutines  and  implement 
inter-processor  communication.  Adapt  I/O,  model  initialization,  and 
namelist  configuration  to  parallel  architecture.  Obtain  agreement  between 
output of non-parallel code and both parallel code running on a single
processor and parallel code running on multiple processors through careful
testing. Conduct performance optimization;

•  Raob Observation Operator: Develop observation operator for radiosonde
observations;

•  Satellite  Observation  Operator:  Develop  observation  operator  for  GOES- 
8  satellite  sounder  radiances; 

•  Documentation:  Produce  beta  version  of  user’s  guide.  This  version  will  be 
comprehensive  enough  for  users  not  previously  familiar  with  the  4d-Var 
system  to  implement  it  with  minimal  effort. 
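
A standard consistency check for a newly coded adjoint is the dot-product identity, which requires that the inner product of the TLM output with an arbitrary vector equal the inner product of the TLM input with the adjoint applied to that vector. The source does not specify the tests used for the MM5 adjoint, so the sketch below is only an illustration on a hypothetical two-variable linearization; `tlm`, `adj`, and the test vectors are made up for this example.

```python
def tlm(x, dx):
    """Tangent-linear operator L(x) applied to dx (toy example)."""
    return [(1.0 + 0.1 * x[1]) * dx[0] + 0.1 * x[0] * dx[1],
            -0.2 * x[0] * dx[0] + dx[1]]


def adj(x, y):
    """Adjoint operator: transpose of L(x) applied to y."""
    return [(1.0 + 0.1 * x[1]) * y[0] - 0.2 * x[0] * y[1],
            0.1 * x[0] * y[0] + y[1]]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


x = [1.0, 2.0]      # linearization state
dx = [0.5, -0.3]    # arbitrary input perturbation
y = [0.2, 0.9]      # arbitrary vector in output space

lhs = dot(tlm(x, dx), y)    # <L dx, y>
rhs = dot(dx, adj(x, y))    # <dx, L^T y>
print(lhs, rhs)
```

The two numbers should agree to within rounding error; a larger mismatch indicates that the adjoint is not the exact transpose of the tangent-linear code.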

During  the  Beta  Test  Code  build,  AER  served  as  the  focal  point  for  the 
Problem/Change  Request  (P/CR)  process  outlined  in  the  Software  Development  Plan 
(SDP)  for  any  action  items  that  came  up  during  the  Alpha  Test  Review.  This  would 
normally  involve  coordinating  changes  to  both  code  and  the  SDL.  However,  the  majority 
of  the  effort  was  directed  at  optimization  of  the  ADJ  of  the  4d-Var  system.  As  with  the 
NLM  and  TLM  during  Alpha  Code  Build,  the  ADJ  was  also  placed  under  version  control 
(CVS) and configured with the UNIX make and make_mpp utilities. Functions were
added  to  the  baseline  4d-Var  code  that  permitted  access  to  and  quality  control  of  satellite 
radiance  data  that  will  serve  as  an  important  source  of  data  for  each  analysis.  During  the 
Beta  Test  Code  Build  AER  managed  the  activities  related  to  the  SDL. 

2.3.4.3  Initial  Operating  Code  Build 

After the Beta test review was concluded, the team released the Beta test version to a
broad spectrum of test users for Beta testing, or use in a more application-oriented mode,
who provided feedback on any residual errors or functional problems and deficiencies. The
CTA  and  project  leaders  reviewed  the  results  and  lessons  learned  during  the  Beta  test 
period  and  determined  what  functions  and  capabilities  should  go  into  the  final  Initial 
Operating  Capability  (IOC)  "version  1.0"  of  the  CHSSI  code.  The  team  added  these 
additional  functions,  incorporated  remaining  fixes  identified  during  Beta  test,  and  updated 
the  documentation.  When  the  code  and  supporting  documentation  and  processes  were 
fully  functional  on  two  or  more  HPC  platforms  and  the  code  was  ready  for  release  to  the 
general  DoD  community,  the  team  will  declare  itself  ready  for  IOC.  The  components  are 
spelled  out  in  the  Software  Development  Plan  and  are  summarized  below: 

•  Non-linear  Forecast  Model:  Incorporate  bug  fixes  encountered  in  Beta 
testing  and  physics  upgrades  to  the  4d-Var  system; 

•  Tangent-linear  Forecast  Model:  Incorporate  bug  fixes  encountered  in  Beta 
testing  and  physics  upgrades  to  the  4d-Var  system; 

•  Adjoint Model: Incorporate bug fixes encountered in Beta testing and
physics  upgrades  to  the  4d-Var  system; 

•  Satellite  Observation  Operator:  Incorporate  bug  fixes  encountered  in  Beta 
testing; 

•  Documentation: Enhance the user’s guide based on Beta users’
comments.

During the IOC Code Build, AER was responsible for managing any remaining core
programming activities related to the adjoint model as well as the P/CR process for any
action items that might arise during the Beta Test Review and Beta test user feedback.
AER also continued in its role as SCM Manager for all relevant SDL activities.

2.3.4.4 Data Acquisition and Quality Control

AER was responsible for identifying, obtaining, calibrating and validating required
satellite radiance, upper-air and supporting data. Data sources included the Air Force
Interactive Meteorological System, and national archive sites maintained by NCAR.

AER utilized tools produced in a separate project that performed a quality control step on
the GOES-8 sounder data. This data processing step eliminated cloud-contaminated
pixels from the data stream and removed the bias between the measured GOES-8
brightness temperatures and those computed from collocated radiosonde data.
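
The quality control step described above can be sketched as follows. This is only a minimal illustration: the record layout, the `is_cloudy` flag, and the use of a per-channel mean difference as the bias estimate are assumptions made for the example and are not taken from the AER tools, whose actual algorithm is not described here.

```python
from statistics import mean


def quality_control(pixels):
    """Drop cloud-contaminated pixels and remove the mean bias per channel.

    Each pixel is a dict with a hypothetical layout:
      'channel'   - sounder channel number
      'tb_obs'    - measured brightness temperature (K)
      'tb_raob'   - brightness temperature computed from a collocated
                    radiosonde profile (K)
      'is_cloudy' - True if the pixel was flagged as cloud contaminated
    """
    # Step 1: keep only clear-sky pixels.
    clear = [p for p in pixels if not p["is_cloudy"]]

    # Step 2: estimate the bias of each channel as mean(observed - computed).
    diffs = {}
    for p in clear:
        diffs.setdefault(p["channel"], []).append(p["tb_obs"] - p["tb_raob"])
    bias = {channel: mean(values) for channel, values in diffs.items()}

    # Step 3: subtract the channel bias from each clear-sky observation.
    corrected = [
        dict(p, tb_corrected=p["tb_obs"] - bias[p["channel"]]) for p in clear
    ]
    return corrected, bias
```

In this sketch the bias is estimated from the same clear-sky sample it is applied to; an operational scheme would more plausibly estimate it from a separate training period.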

2.3.4.5  Systems  Engineering  and  Program  Management 

AER, in its role as Configuration Manager, provided Configuration Management
(CM) services for all aspects of the CHSSI 4d-Var code development. This included CM
for  the  Software  and  Document  Libraries  during  the  development  of  reusable  software 
and  during  the  development  of  new  software.  The  components  are  spelled  out  in  the 
Software  Development  Plan  and  are  summarized  below: 

•  Planning: Compose the SDP and TEMP Addendum, perform requirements analysis,
outline the structure of any software that needs to be developed, and
periodically adjust plans based on how the project is proceeding;

•  Configuration  Management:  Set  up  a  consistent  CVS  code  repository.  Set 
up  a  consistent  document  repository  for  controlling  all  relevant  project 
documents  in  addition  to  user  documentation.  Compose  and  implement  the 
Configuration Management Plan that contains procedures for the use of the
repository.  Include  policies  for  the  maintenance  of  the  repository  over  the 
life  of  the  project; 

•  Quarterly  and  Financial  Reporting:  Provide  all  required  CHSSI  reporting 
to  the  Program  Manager.  Financial  documents  were  archived  in  the  SDL. 

AER provided contractual and other software and documentation releases of the
CWO-5 code, following established procedures documented in the AER Quality
Management System.

2.3.5  Migration  to  MM5v3 

The current MM5 forecast model is now up to Version 3 (MM5v3) and includes
substantial coding and physics upgrades over MM5v1. John Michalakes, our technical
consultant, suggested in an email communication after the Software Acceptance Review
that rather than introduce changes to the existing MM5v1 4d-Var to make that system
scalable, the CWO-5 development team should instead base its 4d-Var code on
MM5v3. This meant that the development team would have to create updated versions of
the tangent-linear and adjoint models (TLM and ADJ, respectively) to be compatible with
the latest release of the MM5 at that time, Version 3.4. The final decision to update the
TLM and ADJ to Version 3 weighed feedback from potential users of the code and issues
related to the maintainability of the code. The structure of the CWO-5 MM5v3 Adjoint
Modeling System is illustrated in Figure 2. The green colored boxes refer to codes that
required no further updates, otherwise known as “non-development items.” The orange
colored boxes represent new components of the 4d-Var system that had to be developed
and tested for correctness for this project in order to attain consistency with Version 3 of
the MM5. The methodology for developing these components is described in Section 3.

After the HPCMO approved this change of work plan, AER produced an updated
WBS to reflect the new MM5v3 4d-Var development path and began the software
process for the development of new code. We selected a development tool called the
Tangent-Adjoint Model Compiler (TAMC; http://www.autodiff.com/tamc/; Giering and
Kaminski 1998). We adopted a hybrid approach for developing new software with
TAMC (Figure 3): First, manual changes to the MM5v3 NLM modules eliminated
non-standard Fortran 77 code (e.g., POINTER statements). INCLUDE statements were also
removed. Then, TAMC generated the TLM and ADJ code. Finally, the POINTER and
INCLUDE statements were returned to the new modules, which were then ready for
testing.
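
To make the relationship between the NLM, TLM, and ADJ concrete, the sketch below hand-codes all three for a toy two-variable model and applies two standard consistency checks: a finite-difference check of the TLM against the NLM, and the adjoint identity <L dx, dy> = <dx, L^T dy>. The toy model and all function names are hypothetical; this is not MM5 code, not TAMC output, and not necessarily the test procedure shown in Figure 3.

```python
def nlm(x):
    """Toy nonlinear model: one step of a two-variable system."""
    a, b = x
    return [a + 0.1 * a * b, b - 0.1 * a * a]


def tlm(x, dx):
    """Tangent-linear model: Jacobian of nlm at x applied to perturbation dx."""
    a, b = x
    da, db = dx
    return [da + 0.1 * (da * b + a * db), db - 0.2 * a * da]


def adj(x, dy):
    """Adjoint model: transposed Jacobian of nlm at x applied to dy."""
    a, b = x
    ya, yb = dy
    return [(1.0 + 0.1 * b) * ya - 0.2 * a * yb, 0.1 * a * ya + yb]


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def tlm_check(x, dx, eps=1e-6):
    """Return (finite-difference NLM response, TLM response) for comparison."""
    xp = [xi + eps * di for xi, di in zip(x, dx)]
    fd = [(p - q) / eps for p, q in zip(nlm(xp), nlm(x))]
    return fd, tlm(x, dx)


def adjoint_check(x, dx, dy):
    """Return both sides of the adjoint identity <L dx, dy> = <dx, L^T dy>."""
    return dot(tlm(x, dx), dy), dot(dx, adj(x, dy))
```

For a correct adjoint the two values returned by `adjoint_check` agree to floating-point round-off, while the finite-difference comparison in `tlm_check` agrees only to first order in `eps`.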


Figure 2. Schematic Diagram of Incremental 4d-Var

[Figure 3 image: slide from the CHSSI CWO-5 Alpha Test Review, "Project Overview: Development & Validation of CWO-5 Code"]

Figure 3. Illustration of Testing Procedure Adopted in CWO-5 TLM and ADJ Code Development

2.3.6  Distributed  Memory  Architecture  Issues 

A project of this size raises a number of important software engineering and
programming issues. Fortunately, most of these issues were
addressed  during  the  development  of  the  Distributed  Memory  (DM)  version  of  the  MM5 
forecast  model.  These  were  outlined  in  a  document  entitled  Introduction  to  the 
Distributed  Memory-Parallel  MM5  System,  and  another  entitled  4d-Var  Driver 
Parallelization  Issues. 

2.3.7  Funding  Structure 

AFRL  was  awarded  a  CHSSI  grant  for  this  project  on  2  November  1999.  In  order  to 
expedite  progress  on  AER  tasks  and  responsibilities  outlined  in  Table  1,  AFRL  provided 
some  of  the  initial  funding  to  AER  on  a  previously  negotiated  contract  with  AER 
(F19628-96-C-0053).  This  allowed  AER  to  begin  work  on  some  of  the  initial 
documentation  requirements  soon  after  AFRL  received  funding  (February  2000).  The 
contract between AER and AFRL negotiated specifically for this project
(F19628-00-C-0054) was signed on 8 June 2000.

3. METHODS AND PROCEDURES


3.1  Requirements  Analysis 

The  requirements  analysis  process  develops  the  basis  for  the  entire  project  effort. 
CHSSI  project  proposals  are  a  first  step  in  this  process.  They  consider  how  existing  codes 
perform  on  new  HPC  platform(s)  and  where  their  difficulties/risks  have  been  and  will  be. 
From  this,  project  scientists  can  determine  to  some  level  of  detail  what  needs  to  be  done 
to  maximize  performance,  efficiency,  usability,  functionality,  scalability,  portability,  etc. 
This  analysis  forms  the  basis  of  the  project  proposal,  and  it  can  be  used  to  determine 
many of the project's test requirements later in the process. The requirements analysis
determines the project goals, such as “be able to model a 10 million atom problem,” or
“process 256x256-pixel images at rates up to 45 frames per second to allow real-time
analysis.” The requirements analysis defines the project's baseline or starting point: the
amount of existing code from which you're starting. This baseline provides the HPCMO
team with a benchmark against which to measure progress throughout the course of the
project.

The  general  requirement  for  CWO-5  was  to  create  a  scalable  version  of  an  existing 
4d-Var  analysis  system  that  can  assimilate  satellite  radiance  data  and  is  computationally 
efficient  enough  to  run  in  real-time  on  a  massively  parallel  processing  supercomputer. 

3.2  Software  Development  Plan 

Project  teams  normally  spend  a  considerable  amount  of  time  defining  what  they 
want  to  accomplish  (their  goals)  and  planning  how  they  will  accomplish  those  goals.  A 
common  tool  used  to  record  the  results  of  this  planning  is  a  Software  Development  Plan 
(SDP),  which  details  the  major  tasks  or  phases  of  the  total  effort,  including  the  test 
activities  used  to  verify  progress  as  the  project  is  executed.  A  critical  step  in  the  definition 
of  the  SDP  is  to  define  the  functionality  envisioned  in  each  version  of  the  code  that's 
developed. 

This SDP was developed in conformance with MIL-STD-498
(http://www.software.org/quagmire/descriptions/mil-std-498.asp). [Note that MIL-STD-498
was cancelled in June 1998, when it was superseded by IEEE/EIA 12207.] It is
structured in sections following the format and content provisions of Data Item
Description (DID) DI-IPSC-81427. Each section identifies tailoring applied to the
structure and instructions for content defined in the DID. The structure of the overall
CHSSI CWO-5 project procedures is patterned after MIL-STD-1521
(http://sparc.airtime.co.uk/users/wysywig/1521b.htm). [Note that MIL-STD-1521 was
cancelled in April 1995, when it was superseded by MIL-STD-973
(http://wwwedms.redstone.army.mil/edrd/973.html).] The general contents of the CWO-5
SDP are described in Table 2.

AER worked with AFRL to complete the CWO-5 SDP.

Table 2. Outline of the CWO-5 Software Development Plan

Section  Description
-------  -----------
1        Scope; system and document overviews; relation to other documents
2        Documents referenced by this SDP and used during its preparation
3        Overview of the required work, including work breakdown structure
         and spend plan
4        Plans for general software development activities
5        Details of all software planning, design, development, reengineering,
         integration, test, evaluation, Software Configuration Management
         (SCM), product evaluation, and preparation for delivery activities
6        Defines the project schedule and activity network including year by
         year deliverables
7        Describes the project organization and the resources required to
         accomplish the work
8        Contains the acronyms used in the SDP

3.2.1  Software  Development  Process 

The software development plan describes a process to develop software in an
incremental series of builds, beginning from pre-selected reusable software code (i.e.,
the MM5v1 4d-Var and MM5v3 NLM). The CWO-5 software team developed the 4d-Var
system in accordance with processes defined in Section 5 of the SDP in the context
of the software engineering process model presented in Figure 4. Note that CHSSI
projects generally begin with established codes, i.e., codes that have effectively fulfilled
the initial processes shown in Figure 4, including Software Requirements, Preliminary
Design, and Detailed Design. The CWO-5 process integrated reusable software from
existing sources with newly developed software. Software design and coding were the
responsibility of the Software Development Group (generally, AER, NCAR, and FSU)
using an object oriented design approach. AER and AFRL defined the Test Case
Descriptions and Test Procedures and conducted the Unit and Unit Integration code tests.
AER prepared a Software Test Plan (STP) that described test procedures and then
executed the test cases defined in the TEMP. These tests generated Software Test Reports
(STRs) that described results of both unit and unit integration tests. A record of the
activities and results of the software development process was logged in the Software
Development Files (SDFs). These files, along with other pertinent project references, were
deposited and maintained in the SDL and made available to support management reviews,
metrics calculations, quality audits, product evaluations, and preparation of product
deliverables. All facets of the software engineering process are under configuration
management and follow AER’s quality management procedures.

Figure 4. Software Engineering Process Model


3.3  Test  and  Evaluation  Master  Plan  (Addendum) 

As the Software Development Plan is created, the test plan that will verify the
success of the development efforts is created as well. Each project’s test program is
summarized in an addendum to the CHSSI Test and Evaluation Master Plan, or TEMP.
The TEMP Addendum contains the Measures of Effectiveness and Suitability (MOE&S),
or metrics. The MOE&S’s are cross-referenced to the Critical Technical Parameters
(CTP) necessary to successfully achieve the MOE&S, and are also cross-referenced to the
Critical Operational Issues. The CWO-5 CTPs are shown in Table 3. The TEMP Addendum
also contains Program Management Indicators that are used to assess the overall
performance of AFRL and collaborators and their ability to develop and follow
reasonable and appropriate procedures for managing a software effort of this size and
scope.

AER and AFRL together created the CWO-5 TEMP Addendum document.

Table 3. Critical Technical Parameters

Each entry gives the test event (SAT, ALPHA, BETA, IOT&E), the decision supported,
the objective (target value), and the threshold (minimum required value).

CRITICAL TECHNICAL PARAMETER: Scalable software suites

  SAT (Decision Supported: Full Scale Development)
    Objective: Wall-clock time baseline established for single processor
    Threshold: Wall-clock time baseline established for single processor

  ALPHA (Decision Supported: Alpha Release)
    Objective: Wall-clock time reduced by 32 times that of baseline on
               non-linear and tangent-linear components
    Threshold: Wall-clock time reduced by 16 times that of baseline on
               non-linear and tangent-linear components

  BETA (Decision Supported: Beta Release)
    Objective: Wall-clock time reduced by 32 times that of baseline for
               all components
    Threshold: Wall-clock time reduced by 16 times that of baseline for
               all components

  IOT&E (Decision Supported: Milestone III)
    Objective: Wall-clock time reduced by 64 times that of baseline for
               all components
    Threshold: Wall-clock time reduced by 32 times that of baseline for
               all components

CRITICAL TECHNICAL PARAMETER: Portable application software

  SAT (Decision Supported: Full Scale Development)
    Objective: Codes will run on one HPC platform producing valid results
    Threshold: Codes will run on one HPC platform producing valid results

  ALPHA (Decision Supported: Alpha Release)
    Objective: Codes will run on two HPC platforms with same valid results
    Threshold: Codes will run on two HPC platforms with same valid results

  BETA (Decision Supported: Beta Release)
    Objective: Codes will run on three HPC platforms with same valid results
    Threshold: Codes will run on two HPC platforms with same valid results

  IOT&E (Decision Supported: Milestone III)
    Objective: Codes will run on three or more HPC platforms with same
               valid results
    Threshold: Codes will run on two HPC platforms with same valid results

CRITICAL TECHNICAL PARAMETER: Correctness

  SAT (Decision Supported: Full Scale Development)
    Objective: At least 1 analysis from test case produces accurate,
               valid output
    Threshold: At least 1 analysis from test case produces accurate,
               valid output

  ALPHA (Decision Supported: Alpha Release)
    Objective: At least 2 analyses from test case produce accurate,
               valid output
    Threshold: At least 1 analysis from test case suite produces accurate,
               valid output

  BETA (Decision Supported: Beta Release)
    Objective: At least 3 analyses from test case produce accurate,
               valid output
    Threshold: At least 2 analyses from test case produce accurate,
               valid output

  IOT&E (Decision Supported: Milestone III)
    Objective: Three or more sub-problems from test case suite produce
               accurate, valid output; satellite data improves RMSE in
               forecasts of analysis dependent-variables
    Threshold: At least 3 sub-problems from test case suite produce
               accurate, valid output; satellite data improves RMSE in
               forecasts of analysis dependent-variables

CRITICAL TECHNICAL PARAMETER: Correctness

  SAT (Decision Supported: Full Scale Development)
    Objective: Output on multi-processor machine runs agrees with
               single-processor vector-class benchmark results to within 5%
    Threshold: Output on multi-processor machine runs agrees with
               single-processor vector-class benchmark results to within 10%

  ALPHA (Decision Supported: Alpha Release)
    Objective: Output from multi-processor runs agrees with single-processor
               results to within 1%
    Threshold: Output from multi-processor runs agrees with single-processor
               results to within 5%

  BETA (Decision Supported: Beta Release)
    Objective: Output from multi-processor runs agrees with single-processor
               results to within numerical round-off error
    Threshold: Output from multi-processor runs agrees with single-processor
               results to within 1%

  IOT&E (Decision Supported: Milestone III)
    Objective: Output from multi-processor runs agrees with single-processor
               results exactly
    Threshold: Output from multi-processor runs agrees with single-processor
               results to within numerical floating-point round-off
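
The objective and threshold values in Table 3 lend themselves to a mechanical check. The sketch below is one possible way to grade a measured value; it is not the CWO-5 test harness. It assumes that "reduced by N times that of baseline" means a speedup factor of N (single-processor wall-clock time divided by multi-processor wall-clock time), and that agreement "to within p%" means a maximum relative difference over all compared values. All function names are hypothetical.

```python
def speedup(t_single, t_parallel):
    """Speedup factor: single-processor time divided by parallel time."""
    return t_single / t_parallel


def max_relative_difference(reference, candidate):
    """Largest |candidate - reference| / |reference| over all value pairs.

    Assumes every reference value is nonzero.
    """
    return max(abs(c - r) / abs(r) for r, c in zip(reference, candidate))


def grade(value, objective, threshold, larger_is_better=True):
    """Return 'objective', 'threshold', or 'fail' for a measured value."""
    if not larger_is_better:
        # Flip signs so that the comparisons below work for error measures.
        value, objective, threshold = -value, -objective, -threshold
    if value >= objective:
        return "objective"
    if value >= threshold:
        return "threshold"
    return "fail"
```

For example, with the ALPHA scalability values (objective 32, threshold 16), a run that takes 3200 s on one processor and 160 s in parallel has a speedup of 20 and therefore meets the threshold but not the objective.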

3.4  Software  Acceptance  Test  Review 

Once the SDP is approved and the code is developed to the point that it has
achieved some initial minimum performance levels, it is preferable to hold a Software
Acceptance Test (SAT) review, the first major technical milestone for the project. The
SAT is a review of the scope, plan, problems, direction, “state-of-the-code,” etc. for each
effort. At the review, the major questions are “Is it a sound project with a reasonable
chance of success?”, “Does it have an acceptable level of risk?”, “Do we know where we
want to go and how to get there?”, and “Will the project provide enough added HPC
capability to the DoD to make it worth the money?” SAT reviews are held once the size
and scope of a project are determined and the feasibility of applying a project to HPC
technology can be discussed. The CWO-5 SAT review was held about 6 months into the
project, on 29 June 2000. The SAT decision authority is the CTA leader; the formal
decision at SAT is whether to proceed with HPC parallel software development. If the
project is approved, the project team then begins the Alpha phase of development.

The  HPCMO  requested  certain  pre-review  material,  which  the  CWO-5  team 
provided.  Next,  the  CWO-5  team  outlined  the  action  items  it  felt  were  necessary  in  order 
to be prepared for the SAT. The CWO-5 SAT was held 29 June 2000 at AFRL on
Hanscom AFB, MA. The HPCMO forwarded its approval for CWO-5 to proceed to
Alpha testing on 6 July 2000.

3.5  Alpha  Test  Code  Review 

The Alpha test review is the second major milestone on CHSSI projects. It is
typically held 15 to 18 months into a 3-year project. The chairman of the Alpha test
review is the CTA leader. At the Alpha test review, an independent panel (panel members
can be members of the HPCMO staff or a reviewer external to the project) evaluates the
code(s)  against  the  test  criteria,  reviews  the  project's  internal  procedures  and  external 
interfaces,  and  provides  recommendations  to  the  chairman.  The  approval  authority  for 
Alpha  pass/fail  is  the  CHSSI  Project  Manager.  The  decision  at  Alpha  is  whether  the  code 
should  be  released  to  a  "friendly"  set  of  users  to  provide  their  impressions  of  the  code's 
functionality  and  usability.  The  project  team  will  provide  the  CHSSI  Project  Manager  a 
set  of  documentation  and  the  completed  testing  checklist  at  the  Alpha  test  review. 

If  the  project  is  approved  to  continue,  the  test  users  wring  out  the  Alpha  code 
functionality  and  usability,  identify  bugs,  inconsistencies,  confusing  points,  etc.  At  the 
same time, the development team begins the Beta code development activities by
integrating new functionality, fixes, and changes identified during the Alpha test period
into the production version of the code. During this phase, the project
team  also  prepares  thorough  software  documentation.  When  the  code  has  achieved  “full” 
functionality  on  two  or  more  DoD  HPC  platforms  and  is  in  a  usable  state  by  the  CTA 
community,  the  team  will  schedule  the  Beta  test  review. 

The Alpha Test Review for CWO-5 took place on 5 November 2001 at AFRL on
Hanscom AFB, MA. In advance of the review, the CWO-5 team put together a plan to
address all outstanding issues. The plan mapped the requirements to a series of tasks that
the CWO-5 team had to complete to pass the review. The CWO-5 team prepared a
post-review report to document the results of the review activities. After approval from the
HPCMO, an Alpha Release was assembled and made available to Alpha testers. Results
of the Alpha Review and Alpha Release are discussed in Section 4.1.

3.6 NCAR Release

The  success  of  the  CHSSI  program  is  judged,  in  part,  by  the  number  of  scientists 
and  researchers  who  benefit  from  the  scalable  codes  that  are  produced  by  the  projects.  In 
the  case  of  CWO-5,  our  primary  focus  was  on  AFWA;  however,  we  also  wished  for  the 
wider  meteorological  community  to  benefit  from  this  project.  The  CWO-5  team  felt  that 
if  the  CWO-5  4d-Var  application  were  approved  as  part  of  the  MM5  release  software,  the 
code  would  attain  a  legitimacy  beyond  what  we  could  otherwise  achieve.  The  CWO-5 
project  team  approached  NCAR  with  this  idea.  NCAR  agreed  to  do  this,  and  the  team 
delivered  a  “non-CHSSI”  release  to  NCAR  for  acceptance  testing.  The  release  contains 
user  documentation,  a  registration  form,  and  the  CWO-5  code  and  test  data. 

The  NCAR  release  was  similar  to  the  Alpha  Test  Code  release  in  that  it  was  not  yet 
fully  scalable,  so  testing  at  NCAR  was  confined  to  correctness  tests  of  the  serial  version. 
However, the NCAR release did include bug fixes to the Alpha code. Some additional
bugs were discovered during NCAR’s acceptance testing, and the fixes were incorporated
into the Beta Test Code.

3.7  Beta  Test  Code  Review 

The  Beta  test  review  is  the  third  major  project  milestone.  It  is  typically  held  6-12 
months  after  the  Alpha  test  review,  or  about  30  months  into  a  3-year  project.  An 
independent test team is again used to review the Beta version of the code and the
documentation prepared. These items are compared to the criteria set forth in the TEMP
Addendum. The team also reviews the lessons learned and procedures used during the
previous Alpha testing period. The review panel for Beta test review consists at a
minimum of the CTA leader and a representative from the HPCMO. The CWO-5 review
panel included representation from NCAR. The decisions at Beta are whether the code is
ready for IOT&E and whether to release the formal Beta version of the code. The
approval  authority  for  Beta  pass/fail  is  the  CHSSI  Project  Manager.  The  Beta  Test  Code 
usually  includes  a  set  of  documentation  and  the  completed  testing  checklist. 

The  TEMP  Addendum  Critical  Test  Parameters  for  the  Beta  Review  were  collected 
and  packaged  in  a  form  suitable  to  the  HPCMO.  The  HPCMO  received  the  CWO-5  Beta 
test  results  in  a  report  from  the  Program  Manager.  The  Beta  Test  Review  for  CWO-5  was 
held on 18 September 2002 at the offices of the HPCMO in Alexandria, VA.

After the Beta test review concluded, the team released the Beta Test Code to a
broad spectrum of test users, who provided feedback on any residual errors or functional
problems and deficiencies. The CTA and project leaders reviewed the results and lessons
learned during the Beta test period and determined what functions and capabilities should
go into the final Initial Operating Capability (IOC) “version 1.0” of the CHSSI code. The
team added these additional functions, incorporated remaining fixes identified during
Beta test, and updated the documentation. When the code and supporting documentation
and processes were fully functional on three or more HPC platforms and the code was ready
for release to the general DoD community, the team declared itself ready for IOC. Results
of the Beta Test Code Review and Beta Test Code Release are discussed in Section 4.2.

3.8 Initial Operational Capability Version

The initial guidance from the HPCMO indicated there would be an IOT&E review.
However, the HPCMO later clarified to the CWO-5 Program Manager that the Beta Test
Code would be the last reviewable release. After satisfactory completion of any Beta test
review action items, the CTA leader submits a memorandum to the CHSSI Project
Manager certifying closure of the action items, certifying completion of proper
documentation and supporting procedures, and indicating successful closure of the
project. The approval authority for IOT&E pass/fail is the HPCMO. Following approval,
the  IOC  software  is  released  to  the  full  user  community  for  operational  testing  and  use. 
The  features  included  in  the  IOC  version  are  the  observation  operator  for  GOES-8 
satellite  radiance  assimilation,  the  tropical  cyclone  bogus  data  assimilation,  the 
incremental  driver,  and  upgraded  physics  parameterizations  for  the  TLM  and  ADJ. 

One  of  the  obstacles  to  operational  implementation  of  4d-Var  algorithms  is  their 
large  computational  cost.  Compared  to  3d-Var,  each  iteration  of  the  minimization 
procedure  contains  an  additional  calculation  of  the  nonlinear  forecast  model  and  its 
adjoint.  Even  with  the  speedups  gained  through  parallelization  of  the  MM5  4d-Var  in  the 
CWO-5  CHSSI  project,  this  cost  is  prohibitive  for  most,  if  not  all,  current  MM5  forecast 
applications at AFWA. A variant of the 4d-Var algorithm, named “incremental 4d-Var,”
was proposed by Courtier et al. (1994), which provides for significant speedups in the
minimization procedure. The operational implementation of 4d-Var at the ECMWF
makes use of the incremental formulation. In terms of the unified notation of Ide et al.
(1997), the minimization for the incremental 4d-Var takes the form

\[
\begin{aligned}
J[\delta\mathbf{x}(t_0)] ={}& \tfrac{1}{2}\,\{\delta\mathbf{x}(t_0) - [\mathbf{x}^b(t_0) - \mathbf{x}^g(t_0)]\}^T\,\mathbf{B}^{-1}\,\{\delta\mathbf{x}(t_0) - [\mathbf{x}^b(t_0) - \mathbf{x}^g(t_0)]\} \\
&+ \tfrac{1}{2}\sum_{i=0}^{n}\{\mathbf{H}_i\,\delta\mathbf{x}(t_i) - [\mathbf{y}_i^o - H_i(\mathbf{x}^g(t_i))]\}^T\,\mathbf{R}_i^{-1}\,\{\mathbf{H}_i\,\delta\mathbf{x}(t_i) - [\mathbf{y}_i^o - H_i(\mathbf{x}^g(t_i))]\}
\end{aligned}
\tag{3}
\]

where the increment is defined as $\delta\mathbf{x}(t_0) = \mathbf{x}(t_0) - \mathbf{x}^g(t_0)$. The TLM, which is
linearized about the NLM forecast from the guess $\mathbf{x}^g(t_0)$, is used to predict values of
$\delta\mathbf{x}(t_i)$; similarly, a linearized version of the observation operator is used in the evaluation
of (3). The solution to the minimization problem, $\delta\mathbf{x}^a(t_0)$, is used to obtain an updated
value of $\mathbf{x}^g(t_0)$, and nonlinear effects are incorporated through performing this
procedure in a number of outer iterations. A schematic of this process is shown in Figure
2. Further approximations are usually made by using simplified dynamics in the
linearized model and its adjoint, and/or by using decreased resolution (or smaller spectral
truncation), in the inner loop. Formally, this can be written in terms of a linear
simplification operator from $\delta\mathbf{x}$ to a subset of gridpoints (or spectral modes), $\delta\mathbf{w}$.
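
The outer/inner loop structure of incremental 4d-Var can be illustrated with a scalar toy problem. The sketch below is not the CWO-5 incremental driver: it assumes a one-variable logistic model, an identity observation operator, and scalar background and observation error variances, and all names are hypothetical. Because the control variable is a scalar, the quadratic inner problem is minimized in closed form, whereas a real system would use an iterative minimization driven by the adjoint.

```python
def model_step(x, dt=0.1):
    """Toy nonlinear forecast model (logistic growth), one time step."""
    return x + dt * x * (1.0 - x)


def tlm_step(x, dt=0.1):
    """Derivative of model_step with respect to x (scalar tangent-linear model)."""
    return 1.0 + dt * (1.0 - 2.0 * x)


def incremental_4dvar(xb, obs, b_var, r_var, n_outer=5):
    """Return the analysis at t0 given observations obs[i] of x at steps i = 0..n."""
    xg = xb  # the first guess is the background
    for _ in range(n_outer):
        # Outer loop: nonlinear trajectory from the guess, plus the TLM
        # propagator prop[i] = d x(t_i) / d x(t_0) linearized about it.
        traj, prop = [xg], [1.0]
        for _ in range(len(obs) - 1):
            prop.append(prop[-1] * tlm_step(traj[-1]))
            traj.append(model_step(traj[-1]))
        # Innovations: observation minus guess trajectory (H is the identity).
        innov = [y - x for y, x in zip(obs, traj)]
        # Inner problem: the incremental cost is quadratic in the scalar
        # increment, so its minimizer is available in closed form.
        num = (xb - xg) / b_var + sum(l * d for l, d in zip(prop, innov)) / r_var
        den = 1.0 / b_var + sum(l * l for l in prop) / r_var
        xg += num / den  # update the guess with the analysis increment
    return xg
```

With accurate observations and a weak background constraint, the analysis should move from the background toward the initial state that generated the observations.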


Additional  features  that  were  to  be  incorporated  by  FSU  into  the  IOC  version  are 
the  Bogus  Vortex  Data  Assimilation  (BDA)  and  upgraded  physics  parameterizations.  The 
BDA scheme was developed by FSU and incorporated into a later release of the CWO-5
code. It permits the use of bogus data to initialize a tropical cyclone vortex (see, e.g., Zou
and  Xiao  2000).  Additional  physics  parameterizations  were  the  Grell  (1993)  cumulus 
convection  and  the  simple  microphysics  parameterization  described  by  Dudhia  (1989). 

3.9  4d-Var  Document  Repository 

Documentation for the CWO-5 Alpha and Beta Test Code releases is available on
the software distribution disk. Each provides information on the background of the
CWO-5 4d-Var application, installation, building and executing the code, and advanced
options such as making modifications and testing. The distribution disk also includes
documentation that was prepared for the MM5v1 4d-Var application, and a general
description of adjoint techniques. The software distribution disk is available from AER,
Inc.

3.10  4d-Var  Code  Repository 

The guidance provided by the HPCMO at the beginning of CWO-5 indicated that
the development processes defined for each CHSSI project should “mirror” Level 2 of the
Software Engineering Institute’s (SEI) Capability Maturity Model
(http://www.sei.cmu.edu/cmm/). Formal software development practices usually require
some form of version control. For CWO-5, we implemented version control
with CVS. The CWO-5 CVS repository resides on a computer at the offices of AER, Inc.
in Lexington, MA. An export of this repository is included on the distribution disk.
Instructions on using the CWO-5 4d-Var are included in the software delivery component
of this contract.

3.11  Optical  Turbulence  Task 

Funding to support AER, Inc.'s participation in the HEL-JTO project became
available in March 2002. AER, Inc. hosted a kickoff meeting with our AFRL HEL-JTO
project manager and technical POC (Frank Ruggiero) on 26 April 2002. At that time, Dr.
Ruggiero provided AER with a copy of PL-TR-93-2043, "A Model for C_n^2 (Optical
Turbulence) Profiles Using Radiosonde Data" by Dewan et al. (1993) as well as an
introductory slide presentation of the HEL program, goals, and objectives. Note that after
the sensitivity and uncertainty analyses were complete, the HEL-JTO funding situation
changed so that we were unable to proceed with the C_n^2 observation operator and 4d-Var
data assimilation studies.

AER supported the AFRL’s HEL-JTO project through the development, test, and
evaluation of an observation operator for C_n^2. We first outlined a technical approach for
the task. AFRL used the MM5 forecast model system to provide input for the C_n^2 model
and produce forecasts of optical turbulence. The forecasts were then compared to
observations from a thermosonde database. This task investigated the potential utility of
C_n^2 data as a data source that could be assimilated with the CWO-5 4d-Var application.
The objective would be to improve the quality of the initial lower stratospheric analysis
by incorporating the additional information that the thermosondes could provide. Before
doing this, we thought it wise first to explore the potential benefit of
thermosonde data assimilation through a series of analyses. As a result, the optical
turbulence task consisted of four sub-tasks:

3.11.1  Sensitivity  Analysis 

The C_n^2 model described in Dewan et al. (1993) uses information about the dynamic
and thermal structure of the atmosphere. The structure is described by multiple
parameters of the MM5 forecast model. When using the variational method with a
forecast system like MM5, one must be able to transform the model dependent variables
into the observed quantity. Knowing the sensitivity of C_n^2 to the MM5 model variables
will be useful in understanding how well suited this data source will be for data
assimilation. We conducted an adjoint sensitivity analysis of the optical turbulence profile
with respect to profiles of the atmospheric (or MM5) variables of state (u, v, T, P). The
sensitivity analysis was patterned after the one outlined by Xiao and Zou (2001) for the
GOES-8 radiance observation operator. The analysis computes the relative sensitivity, or
fractional change of a response function to a given fractional change of a model input,


where J is the response function, ∇J is the output of the adjoint, and δx is a perturbation
vector.
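
As an illustration of the idea, the Python sketch below computes an elementwise relative sensitivity, s_i = (∂J/∂x_i)(x_i / J). This is one common form and may differ in detail from the expression used in this report. The response function is a hypothetical stand-in for the C_n^2 model, and its hand-coded gradient plays the role of the adjoint output. All names are illustrative.

```python
import numpy as np


def response(x):
    """Hypothetical scalar response J(u, v, T); not the Dewan et al. model."""
    u, v, T = x
    return (u**2 + v**2) / T


def adjoint_gradient(x):
    """Exact gradient of `response`, standing in for the adjoint output."""
    u, v, T = x
    return np.array([2.0 * u / T, 2.0 * v / T, -(u**2 + v**2) / T**2])


def relative_sensitivity(x, J, grad):
    """Fractional change in J per unit fractional change in each input."""
    return grad * x / J


x = np.array([10.0, 5.0, 250.0])  # illustrative u, v (m/s) and T (K)
s = relative_sensitivity(x, response(x), adjoint_gradient(x))

# Cross-check the first entry with a 1% finite-difference perturbation of u.
x_pert = x.copy()
x_pert[0] *= 1.01
s_u_fd = (response(x_pert) - response(x)) / response(x) / 0.01
```

In this toy case a 1% change in u changes J by about 1.6%, so the adjoint-based value and the finite-difference estimate agree to first order. A single adjoint evaluation yields the sensitivities to all inputs at once, whereas finite differences need one perturbed run per input.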

3.11.2  Uncertainty  Analysis 

It will be useful to understand how the errors in the MM5 model inputs translate
into errors in the C_n^2 profiles. The error covariance matrices in (1) determine those
structures in the background and observations that ought to be weighted less heavily than
others. In this subtask we estimate the linear propagation of errors through the
observation operator. The method uses the TLM and ADJ of a forecast model to predict
forecast error variances from an initial estimate of the analysis error variance. From this,
the analysis will provide guidance on the potential for the C_n^2 data to have a positive
impact on the analysis.
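
A minimal sketch of linear error propagation, assuming the standard first-order relation var(y) = diag(H P H^T). Here H is the Jacobian of the observation operator (the role played by the TLM, with H^T corresponding to the ADJ) and P is the model-state error covariance. The matrices and the observation-error variances below are made-up numbers for illustration only.

```python
import numpy as np


def propagate_variance(H, P):
    """Diagonal of H P H^T: error variance of each simulated observation."""
    return np.einsum("ij,jk,ik->i", H, P, H)


def informative_levels(modeled_var, obs_var):
    """Levels where the observation error is smaller than the modeled error."""
    return modeled_var > obs_var


# Hypothetical two-level example.
H = np.array([[1.0, 2.0],
              [0.0, 1.0]])          # Jacobian of the observation operator
P = np.diag([4.0, 1.0])             # model-state error covariance
obs_var = np.array([2.0, 3.0])      # observation error variances

modeled_var = propagate_variance(H, P)
useful = informative_levels(modeled_var, obs_var)
```

In this example the first level has a modeled variance larger than the observation variance, so an observation there would be expected to add information. The second level would not.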

3.11.3  Observation  Operator 

If the uncertainty analysis in Section 3.11.2 shows that the C_n^2 data contain
information that would be beneficial to an analysis, then it might be advantageous to
proceed with the development of a C_n^2 observation operator that would relate the model
state to optical turbulence. This would require that we modularize the Dewan model code
for use in the MM5v3 4d-Var system and eventual data assimilation studies, and test the
operator.

3.11.4  Data  Assimilation  Studies

With the observation operator developed and tested as described in Section
3.11.3, we can then proceed to do 4d-Var data assimilation studies of C_n^2, where the
impact of the optical turbulence data is measured in some controlled way.

4.  RESULTS  AND  DISCUSSION 

The  results  of  the  CWO-5  project  were  demonstrated  during  a  series  of  reviews. 
These results are summarized below. Material of more interest to the general
meteorological community was presented at several conferences (see Nehrkorn et al.
2001a, Nehrkorn et al. 2001b, Ruggiero et al. 2001).

4.1  Alpha  Test  Review 

The  CWO-5  team  successfully  passed  the  Alpha  Test  Review.  The  Critical  Test 
Parameters  for  the  Alpha  Review  and  the  CWO-5  results  are  presented  in  Table  4.  We 
conducted  tests  with  the  scalable  TLM  and  the  serial  4d-Var  applications.  Other  materials 
included  with  the  test  materials  are  the  test  drivers  and  results  for  the  TLM  and  4d-Var 
unit  integration  tests,  the  Alpha  Test  Code  Users’  Document,  and  the  CWO-5  internal  test 
plan  and  test  report  presented  to  the  HPCMO.  At  the  time  of  the  Alpha  Test  Review,  the 
CWO-5  4d-Var  code  was  not  yet  completely  scalable;  the  ADJ  code  was  still  undergoing 
test  and  evaluation.  In  order  to  show  conformance  with  the  CTP  for  speedup,  we 
conducted  tests  with  the  TLM  portion  of  the  CWO-5  code.  During  the  speedup  testing, 
we  discovered  that  I/O  was  the  main  impediment  to  speedup.  The  parallel  file  system  in 
use  on  the  test  IBM  SP  computer  system  was  not  optimally  handling  the  parallel  I/O.  Our 
NCAR  partner  (John  Michalakes)  determined  that  buffering  the  NLM  output  and  the 
TLM input caused a marked increase in the speedup of the parallel system. Other
techniques that helped were limiting the read/write operations to the local subdomain and
using asynchronous reads and writes. This problem/change process had the benefit of being
applicable  to  later  development  of  the  parallel  ADJ.  At  the  time  of  the  Alpha  Test 
Review,  the  CWO-5  code  could  run  on  2  HPC  platforms  (SGI  Origin  and  IBM  SP-3). 

The  serial  4d-Var  code  results  (i.e.,  gradient)  agreed  to  within  13  digits.  The  CWO-5  code 
was  also  run  on  2  different  cases  at  the  time  of  the  Alpha  Review.  The  results  of  the  serial 
and  MPP  version  of  the  TLM  driver  test  differed  by  less  than  1  percent.  These  results  are 
available  in  detail  in  the  Alpha  Review  presentation  in  the  test  materials  file  on  the 
distribution  disk. 

Table 4. Critical Test Parameters (CTPs), the Tested Values and Test Result, and the Objective and
Threshold Requirements for the CWO-5 Alpha Test Review

CTP                            | Tested Value & Test Result   | Evaluation: Optimum Objectives | Evaluation: Minimum Threshold
-------------------------------+------------------------------+--------------------------------+-------------------------------
Scalable software suites:      | Tested Value: 20.9 (TLM),    | Wall-clock time reduced by 32  | Wall-clock time reduced by 16
demonstrate wall-clock time    | 19.7 (NLM)                   | times that of baseline on      | times that of baseline on
speed-up as a function of      | Test Result: Meets Minimum   | non-linear and tangent-linear  | non-linear and tangent-linear
increased Central Processing   | Threshold                    | components                     | components
Units (CPU)                    |                              |                                |
-------------------------------+------------------------------+--------------------------------+-------------------------------
Portable application software: | Tested Value: 2              | Codes will run on two HPC      | Codes will run on two HPC
software application functions | Test Result: Meets Optimum   | platforms with same valid      | platforms with same valid
the same and produces similar  | Objective                    | results                        | results
results, within an acceptable  |                              |                                |
margin of error, on a variety  |                              |                                |
of scalable HPC platforms      |                              |                                |
-------------------------------+------------------------------+--------------------------------+-------------------------------
Correctness: run software on   | Tested Value: 2              | At least 2 analyses from test  | At least 1 analysis from test
multiple analyses and compare  | Test Result: Meets Optimum   | case produce accurate, valid   | case suite produce accurate,
with results of baseline code  | Objective                    | output                         | valid output
-------------------------------+------------------------------+--------------------------------+-------------------------------
Correctness: ensure results    | Tested Value: 0%             | Output from multi-processor    | Output from multi-processor
from MPP runs agree with       | Test Result: Meets Optimum   | runs agrees with single-       | runs agrees with single-
single-processor runs          | Objective                    | processor results to within 1% | processor results to within 5%
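
The scalability entry in Table 4 amounts to comparing a measured speedup against the two evaluation levels. A minimal Python sketch of that comparison follows. The function name classify_speedup is illustrative and not part of the CWO-5 test suite; the 16x and 32x levels and the tested values are taken from the table.

```python
def classify_speedup(speedup, threshold=16.0, objective=32.0):
    """Classify a wall-clock speedup against the Table 4 evaluation levels.

    The defaults are the minimum threshold (16x) and optimum objective (32x)
    listed for the scalability CTP.
    """
    if speedup >= objective:
        return "Meets Optimum Objective"
    if speedup >= threshold:
        return "Meets Minimum Threshold"
    return "Fails to meet Minimum Threshold"


tested = {"TLM": 20.9, "NLM": 19.7}
results = {name: classify_speedup(value) for name, value in tested.items()}
```

Both tested values fall between the two levels, which matches the "Meets Minimum Threshold" result recorded for this CTP.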

4.2  Beta  Test  Review 

The  Beta  Test  Review  was  held  on  18  September,  2002  at  the  offices  of  the 
HPCMO  in  Alexandria,  VA.  The  Critical  Test  Parameters  for  the  Beta  Review  and  the 
CWO-5  results  are  presented  in  Table  5.  We  conducted  tests  with  the  scalable  4d-Var 
application,  which,  unlike  the  Alpha  Test  Code,  contained  the  scalable  ADJ.  Other 
materials  included  with  the  test  materials  are  the  test  results  for  the  4d-Var  unit 
integration  tests,  the  Beta  Test  Code  Users’  Document,  and  the  CWO-5  internal  test  plan 
and  test  report  presented  to  the  HPCMO.  Lessons  learned  during  the  Alpha  phase  of  the 
project  helped  to  produce  impressive  levels  of  speedup  in  the  4d-Var  application.  Our 
NCAR  partner  (John  Michalakes)  determined  that  buffering  the  NLM  output  and  the 
TLM input caused a marked increase in the speedup of the parallel system. Other
techniques that helped were limiting the read/write operations to the local subdomain and
using asynchronous reads and writes. This problem/change process had the benefit of being
applicable  to  later  development  of  the  parallel  ADJ.  At  the  time  of  the  Alpha  Test 
Review,  the  CWO-5  code  could  run  on  2  HPC  platforms  (SGI  Origin  and  IBM  SP-3). 

The  serial  4d-Var  code  results  (i.e.,  gradient)  agreed  to  within  13  digits.  The  CWO-5  code 
was  also  run  on  2  different  cases  at  the  time  of  the  Alpha  Review.  The  results  of  the  serial 
and  MPP  version  of  the  TLM  driver  test  differed  by  less  than  1  percent.  The  log  of  the 
overall  TLM  test  and  development  was  maintained  in  the  TLM  Software  Development 
File. 

The  GOES-8  observation  operator  was  developed  and  tested  with  the  MM5v3  4d- 
Var  system.  Some  results  from  a  simple  test  case  are  presented  below.  The  GOES-8 
domain from which the test data were selected is shown in Figure 5. Note that on this
particular day there was a large amount of cloud cover. Since only data from cloud-free
areas  could  be  used,  the  amount  of  data  actually  selected  for  assimilation  was  relatively 
small.  Figure  6  shows  the  temperature  difference  at  two  model  levels  between  the 
Aviation  Model  analysis  (AVN)  and  the  CWO-5  4d-Var  data  assimilation  that  included 
GOES-8  infrared  sounding  data.  Figure  7  is  the  same  as  Figure  6,  but  for  moisture.  Note 
that  the  largest  effect  is  over  the  relatively  cloud-free  region  near  southern  Georgia. 



Figure  5.  Visible-Band  Data  from  GOES-8  Sounder  Channel  19  for  1249  UTC  on  19  March  2003. 
Black  Rectangles  Represent  Regions  of  Missing  Data. 

Figure 7. Same as Fig. 6, but for Water Vapor Mixing Ratio (g kg^-1)

4.3  Optical  Turbulence  Task 

We conducted a study to develop an optical turbulence observation operator that
could potentially improve forecasts of C_n^2 for the Air Force. The study first explored the
suitability of the C_n^2 data for data assimilation in a system like the MM5v3 4d-Var. Two
questions we wanted to answer before we attempted the data assimilation were “How
sensitive is the modeled C_n^2 quantity to the input MM5 variables?” and “What is the
uncertainty of the observed C_n^2 compared to that from the MM5 model?”

AFRL provided us with a thermosonde dataset collected at Vandenberg AFB,
CA during the period 18-25 October 2001. We compared the C_n^2 data directly observed
by the thermosonde to that computed from radiosonde values of u, v, T, and p (the
thermosonde and radiosonde were on the same balloon). We began with an accepted
model of C_n^2 (Dewan et al. 1993). The object was to see how well correlated the directly
observed C_n^2 was with the values computed by the C_n^2 model. The correlation ranged
from 0.49 to 0.84, with an average correlation of 0.69. An example is shown in Figure 8. The
correlation for this particular time (02 UTC) was 0.75.
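
The profile comparison above can be sketched as a Pearson correlation between observed and modeled values at matching heights. The Python example below correlates log10 values, because C_n^2 spans several orders of magnitude. That choice, the function name profile_correlation, and the sample numbers are all assumptions for illustration. The report does not state how its correlations were computed.

```python
import numpy as np


def profile_correlation(observed, modeled):
    """Pearson correlation of log10(C_n^2) between two matched profiles."""
    log_obs = np.log10(np.asarray(observed, dtype=float))
    log_mod = np.log10(np.asarray(modeled, dtype=float))
    return float(np.corrcoef(log_obs, log_mod)[0, 1])


# Made-up profiles (m^(-2/3)) at the same heights, for illustration only.
observed = [1e-16, 3e-17, 1e-17, 2e-18]
modeled = [2e-16, 6e-17, 2e-17, 4e-18]   # same shape, biased by a factor of 2
r = profile_correlation(observed, modeled)
```

A constant multiplicative bias leaves the log-space correlation unchanged. A high correlation therefore indicates agreement in profile shape rather than in absolute magnitude.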

Having an idea of the upper limit of how well the C_n^2 model fit the data, we next
focused on the sensitivity and uncertainty tests. For this we derived the TLM and ADJ of
the C_n^2 model. The sensitivity analysis revealed the greatest C_n^2 sensitivity to temperature
perturbations below 15 km and to u-wind speed above 10 km. A plot of the sensitivity of
C_n^2 to the model variables is given in Figure 9. Plots for the other time levels of data are
qualitatively similar (not shown).


Figure 8. Observed Thermosonde Data from Vandenberg AFB, CA vs. C_n^2 Derived from MM5
Forecast Model Output Valid at the Same Time. Correlation = 0.75

Figure 9. Sensitivity of C_n^2 as a Function of Height to Perturbations in Wind (u, v), Temperature (T),
and Moisture (q)

The uncertainty analysis attempts to estimate the linear propagation of errors
through the observation operator. When this error (uncertainty) is quantified, it can be
compared to the observed C_n^2 error; if the observed C_n^2 errors are smaller than the
modeled C_n^2 errors (the observed data are “better” than the modeled data), then we can
presume that the introduction of observed data (e.g., through a data assimilation system)
will improve the analysis of variables in the modeling system. An example of this
comparison is illustrated in Figure 10. In this figure, the uncertainty in the C_n^2
observations is represented by the spread of C_n^2 computed for different models of the
optical turbulence parameter. The dominant contribution to the observation error is the
uncertainty in the forward model. The modeled uncertainty is represented by error bars and
is overlaid on the C_n^2 model plots. The plot suggests that for this case, from 5 to 15 km
above ground level, we should expect a measurable improvement in the analysis, since the
simulated C_n^2 error (due to the NWP model forecast error) is larger than the uncertainty
in the observation.

Figure 10. Uncertainty Analysis Showing Relative Error of the Various Models and Model State
Error Bars. The Information Content in the Data Will Reduce the Uncertainty in Those
Regions Where the Spread of the Models is Less Than the Width of the Error Bars. (For Error
Magnitudes That Exceed the Forecast Values of C_n^2, the Error Bar Extends to Negative Infinity on
the Logarithmic Scale.)

As  stated  in  Section  3.11,  our  original  plan  included  the  development  of  an  optical 
turbulence  observation  operator  and  integration  and  testing  of  the  operator  in  the  CWO-5 
4d-Var  system.  Due  to  a  loss  of  funds  from  the  High  Energy  Laser/Joint  Technology 
Office,  which  originally  supported  this  optical  turbulence  task,  we  had  to  end  our  studies 
after  the  data  suitability  task  described  in  this  section.  Therefore,  we  were  not  able  to 
develop and test the C_n^2 observation operator.

5.  CONCLUSIONS 

The  work  conducted  for  this  contract  was  primarily  directed  toward  supporting  the 
AFRL  CWO-5  CHSSI  project.  That  project  was  focused  on  the  development  and  testing 
of  a  scalable  version  of  the  MM5  4d-Var  application.  At  the  start  of  the  project,  the  4d- 
Var  code  was  based  on  version  1  of  the  MM5.  Following  established  software 
development  procedures  and  after  consideration  of  numerous  factors,  the  CWO-5 
development  team  updated  the  MM5  4d-Var  code  to  version  3.  This  necessitated  the 
development and test of a new code for the MM5v3 TLM and ADJ. The CWO-5 code
successfully passed the Alpha and Beta Test Reviews. The CWO-5 code was distributed
to numerous Alpha and Beta test users and has been used extensively in R&D projects at
AER, including projects for DoD clients. Additional studies examined the impact of
optical turbulence parameter (i.e., C_n^2) assimilation. The preliminary results from that
study suggest that C_n^2 data have the potential to improve upper-atmospheric analyses and
the NWP model forecasts made from them when integrated within a data assimilation
framework, such as the 4d-Var method.